Suggesting Sentences for ESL using Kernel Embeddings

Authors

  • Kent Shioda
  • Mamoru Komachi
  • Rue Ikeya
  • Daichi Mochihashi
Abstract

Sentence retrieval is an important NLP application for English as a Second Language (ESL) learners. ESL learners are familiar with web search engines, but generic web search results may not be adequate for composing documents in a specific domain. However, a search system specialized to a single domain may suffer from data sparseness. The recently proposed word2vec partially addresses the data sparseness problem, but fails to extract sentences relevant to queries owing to the way it models the latent intent of the query. Thus, we propose a method of retrieving example sentences using kernel embeddings and N-gram windows. This method implicitly models the latent intent of queries and sentences, and alleviates the problem of noisy alignment. Our results show that our method achieves higher precision in sentence retrieval for ESL on a university press release corpus than a previous unsupervised method used for a semantic textual similarity task.
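The abstract does not give the scoring formula, so the following is only a minimal sketch of one way kernel embeddings and N-gram windows could be combined for retrieval, not the authors' implementation. It assumes an RBF kernel over word vectors, scores a query against each N-gram window of a sentence by the inner product of their empirical kernel mean embeddings, and keeps the best window. The names `TOY_VECTORS`, `rank_sentences`, `gamma`, and the window size `n` are hypothetical, and the two-dimensional vectors are made up for illustration.

```python
import numpy as np

# Hypothetical toy word vectors; a real system would load trained embeddings.
TOY_VECTORS = {
    "submit": [1.0, 0.0], "send": [0.9, 0.1],
    "paper": [0.0, 1.0], "article": [0.1, 0.9],
    "eat": [-1.0, -1.0], "lunch": [-0.9, -1.0],
}

def rbf_kernel(a, b, gamma=1.0):
    """Pairwise RBF kernel between rows of a (m, d) and rows of b (n, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def embedding_similarity(query_vecs, window_vecs, gamma=1.0):
    """Inner product of the two empirical kernel mean embeddings,
    i.e. the average kernel value over all (query word, window word) pairs."""
    return float(rbf_kernel(query_vecs, window_vecs, gamma).mean())

def best_window_score(query_vecs, sentence_vecs, n=3, gamma=1.0):
    """Slide an n-word window over the sentence and keep the best score."""
    n = min(n, len(sentence_vecs))
    return max(
        embedding_similarity(query_vecs, sentence_vecs[i:i + n], gamma)
        for i in range(len(sentence_vecs) - n + 1)
    )

def rank_sentences(query, sentences, vectors, n=3, gamma=1.0):
    """Return (score, sentence) pairs sorted from most to least similar."""
    def embed(text):
        # Words without a vector are skipped (an assumption of this sketch).
        return np.array([vectors[t] for t in text.lower().split() if t in vectors])

    query_vecs = embed(query)
    scored = []
    for sentence in sentences:
        sentence_vecs = embed(sentence)
        if len(query_vecs) and len(sentence_vecs):
            scored.append((best_window_score(query_vecs, sentence_vecs, n, gamma), sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

ranked = rank_sentences(
    "submit paper",
    ["we send the article", "we eat lunch"],
    TOY_VECTORS,
)
```

With these toy vectors, the sentence whose words lie close to the query words in embedding space ranks first even though it shares no surface word with the query, which is the kind of soft matching the abstract contrasts with sparse exact-match retrieval.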

Similar resources

The Primacy of Teacher Imperative Commentaries in the Improvement of Iranian English Majors’ Writing Ability

In this study, the researchers investigated a critical aspect of EFL/ESL writing pedagogy: the impact of teacher written commentary on student writers' earlier drafts. Compositions of 80 Iranian undergraduate English majors were commented on using a trio of imperatives, statements, and questions on both content and form. Overall, the results indicated that the comments in the imperative form hel...

Comparing Acceptability in Magnitude Estimation Tests to an Unsupervised Model of Language Acquisition

Traditionally, language models have been evaluated by testing their ability to mark sentences as grammatical or ungrammatical. But with the emergence of probabilistic and connectionist models on the computational side and magnitude estimation tests on the linguistic side, it might make sense to go all the way and evaluate the models' graded predictions. We present a language acquisition a...

Automatic Extraction of Learner Errors in ESL Sentences Using Linguistically Enhanced Alignments

We propose a new method of automatically extracting learner errors from parallel English as a Second Language (ESL) sentences in an effort to regularise annotation formats and reduce inconsistencies. Specifically, given an original and corrected sentence, our method first uses a linguistically enhanced alignment algorithm to determine the most likely mappings between tokens, and secondly employ...

Bilingual Random Walk Models for Automated Grammar Correction of ESL Author-Produced Text

We present a novel noisy channel model for correcting text produced by English as a second language (ESL) authors. We model the English word choices made by ESL authors as a random walk across an undirected bipartite dictionary graph composed of edges between English words and associated words in an author’s native language. We present two such models, using cascades of weighted finite-state tra...

Monte Carlo Filtering Using Kernel Embedding of Distributions

Recent advances in kernel methods have yielded a framework for representing probabilities using a reproducing kernel Hilbert space, called kernel embedding of distributions. In this paper, we propose a Monte Carlo filtering algorithm based on kernel embeddings. The proposed method is applied to state-space models where sampling from the transition model is possible, while the observation model ...

Publication date: 2017